Combining protein secondary structure prediction models with ensemble methods of optimal complexity
Authors
Abstract
Many sophisticated methods are currently available to perform protein secondary structure prediction. Since they are frequently based on different principles, and different knowledge sources, significant benefits can be expected from combining them. However, the choice of an appropriate combiner appears to be an issue in its own right. The first difficulty to overcome when combining prediction methods is overfitting. This is the reason why we investigate the implementation of Support Vector Machines to perform the task. A family of multi-class SVMs is introduced. Two of these machines are used to combine some of the current best protein secondary structure prediction methods. Their performance is consistently superior to the performance of the ensemble methods traditionally used in the field. They also outperform the decomposition approaches based on bi-class SVMs. Furthermore, initial experimental evidence suggests that their outputs could be processed by the biologist to perform higher-level treatments. © 2003 Elsevier B.V. All rights reserved.
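To make the combination setting concrete, the following is a minimal Python sketch of two simple baseline combiners (a weighted average and a majority vote) and of the stacked input vector that a trained combiner such as an SVM would receive. It is an illustration only, not the method of the paper: the three-state alphabet (H, E, C), the function names, and the scores are all assumptions introduced here, and no SVM is trained.

```python
from collections import Counter

# Assumed three-state alphabet: helix, strand, coil
STATES = ("H", "E", "C")

def stack_features(predictions):
    """Concatenate the per-state scores of every base predictor for one residue.

    The result is the kind of input vector a trained combiner could consume.
    """
    return [score for scores in predictions for score in scores]

def average_combiner(predictions, weights=None):
    """Weighted average of the base predictors' scores, followed by argmax."""
    n = len(predictions)
    weights = weights or [1.0 / n] * n
    combined = [
        sum(w * scores[k] for w, scores in zip(weights, predictions))
        for k in range(len(STATES))
    ]
    return STATES[combined.index(max(combined))]

def majority_vote(predictions):
    """Each base predictor votes for its top-scoring state."""
    votes = [STATES[scores.index(max(scores))] for scores in predictions]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical scores from three base predictors for a single residue
residue = [
    (0.60, 0.10, 0.30),
    (0.20, 0.45, 0.35),
    (0.25, 0.40, 0.35),
]

print(stack_features(residue))    # nine values: three predictors x three states
print(average_combiner(residue))  # state with the highest averaged score
print(majority_vote(residue))     # state chosen by most predictors
```

On these made-up scores the two baselines disagree: one confident predictor dominates the average, while the vote follows the two less confident ones. This kind of disagreement is one reason the choice of combiner matters.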
Similar resources
Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches
The DNA sequence, which contains all genetic traits, is not itself a functional entity. Instead, it gives rise to protein sequences through the transcription and translation processes. The protein sequence later takes on a 3D structure, which is the functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...
Combining Statistical Models for Protein Secondary Structure Prediction
We investigate the problem of combining experts to predict the secondary structure of globular proteins. We first present two different statistical models for this task. We then analyse an efficient linear combination technique, which sheds light on unexplained phenomena frequently encountered in practice for ensemble methods.
Electricity Load Forecasting by Combining Adaptive Neuro-fuzzy Inference System and Seasonal Auto-Regressive Integrated Moving Average
Nowadays, electricity load forecasting, as one of the most important areas, plays a crucial role in the economic process. What separates electricity from other commodities is the impossibility of storing it on a large scale and cost-effective construction of new power generation and distribution plants. Also, the existence of seasonality, nonlinear complexity, and ambiguity pattern in electrici...
A General Method for Combining Predictors Tested on Protein Secondary Structure Prediction
Ensemble methods, which combine several classifiers, have been successfully applied to decrease generalization error of machine learning methods. For most ensemble methods the ensemble members are combined by weighted summation of the output, called the linear average predictor. The logarithmic opinion pool ensemble method uses a multiplicative combination of the ensemble members, which treats ...
Profiles and Majority Voting-Based Ensemble Method for Protein Secondary Structure Prediction
Machine learning techniques have been widely applied to solve the problem of predicting protein secondary structure from the amino acid sequence. They have gained substantial success in this research area. Many methods have been used including k-Nearest Neighbors (k-NNs), Hidden Markov Models (HMMs), Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs), which have attracted atte...
Journal title:
- Neurocomputing
Volume 56, Issue
Pages -
Publication date: 2004